Naive String Matching Algorithm
Interactive demonstration of sliding pattern search with overlapping matches.
📝 Problem Statement
Find all indices in a Text ($T$) where a Pattern ($P$) occurs. After a match is found, the pattern slides forward by **only 1 index** to check for subsequent overlapping matches.
Input Section
Initial Visualization:
💡 The Naive Sliding Approach
Core Principle:
The pattern $P$ is aligned at shift $i$ in the text $T$. We compare $P[j]$ with $T[i+j]$ character by character. If a match is found, we record the index $i$ and then slide $P$ by $i \to i+1$ to continue the search.
Comparison Steps:
- Initialize shift **$i=0$** (start of text).
- At each shift $i$, compare **$P[j]$** with **$T[i+j]$** for $j = 0$ to $M-1$.
- **If mismatch** ($P[j] \neq T[i+j]$): Stop comparison at shift $i$. Increment the shift: **$i \to i+1$**. Reset comparison index $j=0$.
- **If full match** ($j$ reaches $M$): Record index $i$. Increment the shift: **$i \to i+1$**. Reset comparison index $j=0$. (This is the rule allowing overlapping matches).
- Repeat until $i$ exceeds $N-M$.
🎬 Step-by-Step Demo
Visualization
Output Log
Final Match Indices: None
📊 Algorithm Analysis
Best Case Complexity
O(N)
Occurs when the pattern is immediately found at the start, or when mismatches occur immediately (e.g., $T=\text{AAAAAAAA...}$, $P=\text{BAA...}$). The inner loop runs only once per shift $i$.
Worst Case Complexity
O(N * M)
Occurs in cases like $T=\text{AAAAAAB}$ and $P=\text{AAAB}$. The algorithm performs almost $M$ comparisons at every shift $i$ before a mismatch forces a slide. This is inefficient for very large strings.
Why it's "Naive"
The algorithm is naive because when a mismatch occurs, it discards all prior matching information. More advanced algorithms (like KMP) use precomputed data (the failure function) to determine the optimal next slide distance, drastically improving the worst-case time complexity to $O(N)$.